Transactional WaveCache: Towards Speculative and Out-of-Order DataFlow Execution of Memory Operations
WaveScalar is the first DataFlow architecture that can efficiently
provide the sequential memory semantics required by imperative languages. This
work presents an alternative memory ordering mechanism for this architecture,
the Transactional WaveCache. Our mechanism maintains the execution order of
memory operations within blocks of code, called Waves, but adds the ability to
speculatively execute, out-of-order, operations from different waves. This
ordering mechanism is inspired by progress in supporting Transactional
Memories. Waves are considered as atomic regions and executed as nested
transactions. If a wave has finished the execution of all its memory
operations, as soon as the previous waves are committed, it can be committed.
If a hazard is detected in a speculative Wave, all the following Waves
(children) are aborted and re-executed. We evaluate the WaveCache on a set of
artificial benchmarks. When a benchmark does not access memory often, we
achieved speedups of around 90%. Speedups of 33.1% and 24% were observed on
more memory-intensive applications, and slowdowns of up to 16% arise when
memory bandwidth is a bottleneck. For an application full of WAW, WAR and RAW
hazards, a speedup of 139.7% was observed.

Comment: Submitted to ACM International Conference on Computing Frontiers
2008, http://www.computingfrontiers.org/, 20 pages
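The wave-ordering protocol described above (in-order commit of finished waves, with a hazard squashing the offending wave and all younger ones) can be sketched as follows. All names (`Wave`, `WaveCache`, `finish`, `hazard`) are illustrative, not the paper's API:

```python
# Hypothetical sketch of the Transactional WaveCache commit/abort rule:
# waves execute speculatively, but commit strictly in program order.

class Wave:
    def __init__(self, wid):
        self.wid = wid              # program-order position of the wave
        self.done = False           # all memory operations executed
        self.committed = False

class WaveCache:
    def __init__(self):
        self.waves = []             # waves in program order

    def open_wave(self):
        w = Wave(len(self.waves))
        self.waves.append(w)
        return w

    def finish(self, wave):
        """Mark a wave's memory operations complete, then commit in program
        order: a wave commits only once all earlier waves have committed."""
        wave.done = True
        for w in self.waves:
            if not w.done:
                break               # an earlier wave is still speculative
            w.committed = True

    def hazard(self, wave):
        """A conflict in a speculative wave squashes it and every younger
        (child) wave; all of them must re-execute."""
        squashed = [w for w in self.waves
                    if w.wid >= wave.wid and not w.committed]
        for w in squashed:
            w.done = False          # re-execution required
        return squashed

cache = WaveCache()
a, b, c = cache.open_wave(), cache.open_wave(), cache.open_wave()
cache.finish(b)                     # b is done but waits behind a
cache.finish(a)                     # a commits, then b commits in order
squashed = cache.hazard(c)          # hazard in c squashes only c (a, b safe)
```

Here `finish` walks the wave list in program order, so speculative results become permanent only behind committed predecessors, mirroring the nested-transaction commit rule.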
Couillard: Parallel Programming via Coarse-Grained Data-Flow Compilation
Data-flow is a natural approach to parallelism. However, describing
dependencies and control between fine-grained data-flow tasks can be complex
and present unwanted overheads. TALM (an Architecture and Language for
Multi-threading) introduces a user-defined coarse-grained parallel data-flow
model, where programmers identify code blocks, called super-instructions, to be
run in parallel and connect them in a data-flow graph. TALM has been
implemented as a hybrid Von Neumann/data-flow execution system: the
\emph{Trebuchet}. We have observed that TALM's usefulness largely depends on
how programmers specify and connect super-instructions. Thus, we present
\emph{Couillard}, a full compiler that creates, based on an annotated
C-program, a data-flow graph and C-code corresponding to each
super-instruction. We show that our toolchain allows one to benefit from
data-flow execution and explore sophisticated parallel programming techniques,
with little effort. To evaluate our system we have executed a set of real
applications on a large multi-core machine. Comparison with popular parallel
programming methods shows competitive speedups, while providing an easier
parallel programming approach.

Comment: 10 pages, 5 figures
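A minimal sketch of the coarse-grained data-flow model can clarify the idea: code blocks become "super-instructions" whose nodes fire once all input operands have arrived. The names below (`Super`, `execute`) are hypothetical; Couillard itself emits C code per super-instruction plus a graph description, and a real runtime fires ready nodes in parallel rather than sequentially:

```python
from collections import deque

class Super:
    """A coarse-grained data-flow node: a code block plus its input arity."""
    def __init__(self, func, n_inputs):
        self.func = func
        self.n_inputs = n_inputs
        self.inputs = {}            # input slot -> arrived value
        self.successors = []        # list of (target node, input slot)

def execute(tokens):
    """Token-driven execution: a super-instruction fires once every input
    slot holds a value; its result flows along the graph edges."""
    work = deque(tokens)            # (node, slot, value) tokens in flight
    results = {}
    while work:
        node, slot, value = work.popleft()
        node.inputs[slot] = value
        if len(node.inputs) == node.n_inputs:       # all operands ready
            out = node.func(*(node.inputs[i] for i in range(node.n_inputs)))
            results[node] = out
            for succ, s in node.successors:
                work.append((succ, s, out))
    return results

# Two independent super-instructions feed a third that joins their results.
left = Super(lambda x: x * x, 1)
right = Super(lambda x: x + 10, 1)
join = Super(lambda a, b: a + b, 2)
left.successors.append((join, 0))
right.successors.append((join, 1))
results = execute([(left, 0, 3), (right, 0, 3)])    # left -> 9, right -> 13
```

Because `left` and `right` share no edge, nothing orders them with respect to each other; that independence is exactly what a parallel runtime exploits.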
Exploring the Equivalence between Dynamic Dataflow Model and Gamma - General Abstract Model for Multiset mAnipulation
With the growing search for computational models in which parallelism is
expressed naturally, some paradigms arise as options for the next generation
of computers. In this context, dynamic Dataflow and Gamma (General Abstract
Model for Multiset mAnipulation) emerge as interesting choices of
computational model. In the dynamic Dataflow model, operations are performed
as soon as their input operands are available, without relying on a Program
Counter to dictate the execution order of instructions. The Gamma paradigm is based on
a parallel multiset rewriting scheme. It provides a non-deterministic execution
model inspired by an abstract chemical machine metaphor, where operations are
formulated as reactions that occur freely among matching elements belonging to
the multiset. In this work, equivalence relations between the dynamic Dataflow
and Gamma paradigms are exposed and explored, while methods to convert from
Dataflow to Gamma paradigm and vice versa are provided. It is shown that
vertices and edges of a dynamic Dataflow graph can correspond, respectively, to
reactions and multiset elements in the Gamma paradigm. Implementation aspects
of execution environments that could be mutually beneficial to both models are
also discussed. This work offers the scientific community the possibility of
benefiting from both parallel programming models, adding versatility for
researchers and developers. Finally, it is
important to state that, to the best of our knowledge, the similarity relations
between both dynamic Dataflow and Gamma models presented here have not been
reported in any previous work.

Comment: Study submitted to the IPDPS 2019 - IEEE International Parallel and
Distributed Processing Symposium
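The vertex-to-reaction, edge-to-element correspondence claimed above can be illustrated with a toy interpreter. All names here are hypothetical, and the sweep scheduler below fixes an order that real Gamma leaves non-deterministic: each dataflow vertex becomes a reaction, and each token on an edge becomes a multiset element tagged with that edge's name:

```python
from collections import Counter

def pick(multiset, in_edges):
    """Find one available element per required input edge, or None."""
    chosen, used = [], Counter()
    for edge in in_edges:
        found = next((e for e in multiset
                      if e[0] == edge and multiset[e] - used[e] > 0), None)
        if found is None:
            return None
        chosen.append(found)
        used[found] += 1
    return chosen

def gamma_run(multiset, reactions):
    """Repeatedly apply any enabled reaction until none matches -- the
    chemical-machine fixed point. Elements are (edge, value) pairs."""
    changed = True
    while changed:
        changed = False
        for in_edges, out_edge, func in reactions:
            match = pick(multiset, in_edges)
            if match is not None:
                for elem in match:                   # consume reactants
                    multiset[elem] -= 1
                result = func(*(v for (_, v) in match))
                multiset[(out_edge, result)] += 1    # produce the product
                changed = True
    return multiset

# Dataflow graph  x,y -> add -> double -> out  encoded as Gamma reactions:
reactions = [
    (("x", "y"), "s", lambda a, b: a + b),   # vertex "add"
    (("s",), "out", lambda s: 2 * s),        # vertex "double"
]
pool = Counter({("x", 2): 1, ("y", 3): 1})   # initial tokens on input edges
final = gamma_run(pool, reactions)
```

The conversion in the other direction follows the same table read backwards: each reaction's reactant edges become a vertex's input arcs, its product edge the output arc.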